The maximum penalty criterion for ridge regression: application to the calibration of the force constant in elastic network models.

Authors

  • Yves Dehouck
  • Ugo Bastolla
Abstract

Tikhonov regularization, or ridge regression, is a popular technique to deal with collinearity in multivariate regression. We unveil a formal analogy between ridge regression and statistical mechanics, where the objective function is comparable to a free energy, and the ridge parameter plays the role of temperature. This analogy suggests two novel criteria for selecting a suitable ridge parameter: specific-heat (Cv) and maximum penalty (MP). We apply these fits to evaluate the relative contributions of rigid-body and internal fluctuations, which are typically highly collinear, to crystallographic B-factors. This issue is particularly important for computational models of protein dynamics, such as the elastic network model (ENM), since the amplitude of the predicted internal motion is commonly calibrated using B-factor data. After validation on simulated datasets, our results indicate that rigid-body motions account on average for more than 80% of the amplitude of B-factors. Furthermore, we evaluate the ability of different fits to reproduce the amplitudes of internal fluctuations in X-ray ensembles from the B-factors in the corresponding single X-ray structures. The new ridge criteria are shown to be markedly superior to the commonly used two-parameter fit that neglects rigid-body rotations and to the full fits regularized under generalized cross-validation. In conclusion, the proposed fits ensure a more robust calibration of the ENM force constant and should prove valuable in other applications.
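As a rough illustration of the setting described in the abstract, the sketch below fits a ridge regression to two nearly collinear predictors and selects the ridge parameter by generalized cross-validation (GCV), the baseline criterion the abstract compares against. It does not implement the Cv or MP criteria, whose definitions are not given here. The synthetic data, the grid of ridge parameters, and the function names `ridge_fit` and `gcv_score` are all assumptions made for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X^T X + lam * I) beta = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


def gcv_score(X, y, lam):
    """Generalized cross-validation score for ridge parameter lam."""
    n, p = X.shape
    # Hat matrix H maps observations y to fitted values H @ y.
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - H @ y
    return n * float(resid @ resid) / (n - np.trace(H)) ** 2


# Synthetic data (assumption): two nearly collinear predictors.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.5 * rng.normal(size=n)

# Scan a hypothetical grid of ridge parameters.
lambdas = np.logspace(-4, 2, 25)
coefs = np.array([ridge_fit(X, y, lam) for lam in lambdas])
scores = np.array([gcv_score(X, y, lam) for lam in lambdas])
best_lam = lambdas[np.argmin(scores)]

print("GCV-selected ridge parameter:", best_lam)
print("coefficients at that value:", ridge_fit(X, y, best_lam))
```

With collinear predictors the individual coefficients are poorly determined at small ridge parameters, while their norm shrinks steadily as the parameter grows. The choice of criterion decides where along that path the fit is read off.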


Similar resources

Generalized Ridge Regression Estimator in Semiparametric Regression Models

In the context of ridge regression, the estimation of ridge (shrinkage) parameter plays an important role in analyzing data. Many efforts have been put to develop skills and methods of computing shrinkage estimators for different full-parametric ridge regression approaches, using eigenvalues. However, the estimation of shrinkage parameter is neglected for semiparametric regression models. The m...


Bayesian Quantile Regression with Adaptive Elastic Net Penalty for Longitudinal Data

Longitudinal studies include the important parts of epidemiological surveys, clinical trials and social studies. In longitudinal studies, measurement of the responses is conducted repeatedly through time. Often, the main goal is to characterize the change in responses over time and the factors that influence the change. Recently, to analyze this kind of data, quantile regression has been taken ...


On Calibration and Application of Logit-Based Stochastic Traffic Assignment Models

There is a growing recognition that discrete choice models are capable of providing a more realistic picture of route choice behavior. In particular, influential factors other than travel time that are found to affect the choice of route trigger the application of random utility models in the route choice literature. This paper focuses on path-based, logit-type stochastic route choice models, i...


Comparison of Efficiency for Hydrological Models (AWBM & SimHyd) and Neural Network (MLP & RBF) in Rainfall–Runoff Simulation (Case study: Bar Aryeh Watershed - Neyshabur)

For suitable programming and management of water resources, access to perfect information from the discharge at the watershed outlet is essential. In most watersheds, the hydrometric station is not available; then, different models are used to simulate the discharge within watersheds without data. The selection of preferred model for rainfall- runoff simulation depends to the purpose of modelin...


Spatial Regression in the Presence of Misaligned data

In this paper, four approaches are presented to the problem of fitting a linear regression model in the presence of spatially misaligned data. These approaches are plug-in method, simulation, regression calibration and maximum likelihood. In the first two approaches, with modeling the correlation between the explanatory variable, prediction of explanatory variable is determined at sites...



Journal title:
  • Integrative biology : quantitative biosciences from nano to macro

Volume 9, Issue 7

Publication date: 2017